Live freelance tracking. Raw descriptions turned into structured data. Find your next tech project without the noise.
upwork.com 2026-05-08
Refactor Raw Medical Bill Data into Structured Spreadsheet
Client: USA, member since 2026-05-06
Price: ****
Problem: Extract structured data from medical bills for accurate input into a provided spreadsheet.
Existing: Not specified
Specifications:
[Target] - Extract key fields such as patient ID, date of service, procedure codes, and charges.
[Method] - Use optical character recognition (OCR) tools to read text from images or PDFs of medical bills.
[UI/UX] - Design a simple data entry form for manual input if OCR fails on specific documents.
[Stack] - Python with libraries such as PyTesseract for OCR and Pandas for data manipulation, with Excel as the output target.
[Security] - Ensure all health-related information is handled securely following HIPAA guidelines.
[Format] - Output structured CSV or XLSX files compatible with the provided spreadsheet.
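
As a rough illustration of the [Target] and [UI/UX] items, here is a minimal sketch of pulling the named fields out of text that OCR has already produced (for example, the string returned by `pytesseract.image_to_string`). The label wording, the regular expressions, the five-digit procedure code shape, and the sample text are all assumptions made for illustration; real patterns would have to come from the sample bills. Fields that cannot be found are listed so they can be routed to manual entry.

```python
import re

# Hypothetical label patterns; real bills will likely use different wording.
FIELD_PATTERNS = {
    "patient_id": re.compile(r"Patient\s*ID[:#]?\s*([A-Z0-9-]+)", re.IGNORECASE),
    "date_of_service": re.compile(
        r"Date\s*of\s*Service:?\s*(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE
    ),
}

# Assumption: each line item starts with a five-digit numeric procedure code
# and ends with a charge that has two decimal places.
LINE_ITEM = re.compile(r"^\s*(\d{5})\b.*?\$?\s*([\d,]+\.\d{2})\s*$", re.MULTILINE)


def parse_bill_text(text):
    """Return the extracted fields plus a list of fields needing manual entry."""
    record = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        record[name] = match.group(1) if match else None

    record["line_items"] = [
        {"procedure_code": code, "charge": charge}
        for code, charge in LINE_ITEM.findall(text)
    ]

    # Anything the patterns missed is flagged instead of being guessed.
    missing = [name for name in FIELD_PATTERNS if record[name] is None]
    if not record["line_items"]:
        missing.append("line_items")
    record["needs_manual_review"] = missing
    return record


# Made-up OCR output, used only to exercise the function.
SAMPLE_TEXT = """\
Patient ID: P-10442
Date of Service: 03/14/2026
99213 Office visit, established patient $125.00
80053 Comprehensive metabolic panel $1,048.50
"""

if __name__ == "__main__":
    print(parse_bill_text(SAMPLE_TEXT))
```

A non-empty `needs_manual_review` list is the signal to fall back to the manual data entry form instead of trusting the OCR result.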
Workflow:
1. Review sample medical bills to understand structure and identify key data points.
2. Implement OCR using PyTesseract to extract text from bill images/PDFs.
3. Manually enter into a structured form any extracted data that is unclear or requires context.
4. Clean and format the extracted data in Python using Pandas for consistency.
5. Export cleaned data into a structured CSV/XLSX file compatible with the provided spreadsheet.
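
Steps 4 and 5 could look roughly like the sketch below. To stay dependency-free it uses the standard library `csv` module in place of Pandas; the same normalization could be done on a DataFrame and written with `DataFrame.to_csv` or `DataFrame.to_excel`. The column names, the US-style input date format, and the sample rows are assumptions, since the provided spreadsheet's layout is not described in the listing.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Hypothetical column order; it should mirror the client's spreadsheet.
COLUMNS = ["patient_id", "date_of_service", "procedure_code", "charge"]


def clean_row(row):
    """Normalize one extracted row: trim text, ISO dates, two-decimal charges."""
    # Assumption: bills use month/day/year dates.
    date = datetime.strptime(row["date_of_service"].strip(), "%m/%d/%Y").date()
    # Strip currency symbols and thousands separators before parsing.
    charge = Decimal(row["charge"].replace("$", "").replace(",", "").strip())
    return {
        "patient_id": row["patient_id"].strip().upper(),
        "date_of_service": date.isoformat(),
        "procedure_code": row["procedure_code"].strip(),
        "charge": f"{charge:.2f}",
    }


def write_csv(rows, handle):
    """Write cleaned rows, with a header line, to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(clean_row(row))


# Made-up rows, used only to exercise the functions.
RAW_ROWS = [
    {"patient_id": " p-10442 ", "date_of_service": "03/14/2026",
     "procedure_code": "99213", "charge": "$125.00"},
    {"patient_id": "P-10442", "date_of_service": "3/14/2026",
     "procedure_code": "80053", "charge": "1,048.5"},
]

if __name__ == "__main__":
    buffer = io.StringIO()
    write_csv(RAW_ROWS, buffer)
    print(buffer.getvalue())
```

Charges are parsed as `Decimal` so that currency amounts are not subject to binary floating-point rounding. When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends.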